Briefly Noted
Abstract
This book is a diverse collection of ten presentations given at an International Summer School on Information Extraction in Rome, 1997. The goal of information extraction (IE) is selective, task-driven interpretation of text narrative in order to fill out templates with information about a particular scenario. I was disappointed to find that only five of the articles were actually about IE research. The other half of the articles addressed issues peripheral to IE, such as information retrieval (IR) and text classification.
The first two articles are by Yorick Wilks and by Ralph Grishman, who are prominent IE researchers in the UK and the US, respectively. Each gives a high-level discussion of IE, its successes and limitations. Wilks observes that IE's strength comes from its modular architecture. Individual modules such as part-of-speech tagging or morphological analysis can be constructed and optimized independently and reused in a variety of applications. He sees the primary limitation of IE to be the template representation, which restricts the type of information that can be extracted. Grishman describes the typical architecture of IE systems, whose modules include lexical analysis, name recognition, shallow syntactic parsing, task-specific pattern matching, coreference analysis, event merging, and finally template generation. He identifies the main challenges to IE as the cost of adapting a system to a new domain or scenario and a ceiling on performance, which is closely related to the issue of knowledge acquisition and the difficulty of handling complex syntactic structures. Three other articles deal with more specialized topics within IE. Robert Gaizauskas, Kevin Humphreys, Saliha Azzam, and Yorick Wilks describe a system for multilingual IE with some language-independent modules that are indexed by language-specific lexicons. Roberto Basili and Maria Teresa Pazienza discuss corpus-driven lexical acquisition, in particular for the "foreground" lexicon of words that support a particular IE task. Branimir Boguraev and Christopher Kennedy present work in technical-term recognition and how this can be a step towards document summarization.
The remaining articles concern IR, text classification, or heterogeneous database techniques, and are only tangentially related to IE. Gregory Grefenstette presents an NLP-based strategy for suggesting additional IR query terms to a user. Alan Smeaton gives a tutorial on uses of NLP in IR. Nicola Guarino discusses formal ontologies and how these can enhance IR with semantic matching. Filippo Neri and Lorenza Saitta give a tutorial on machine learning that briefly touches on text classification. Sophie Cluet describes database techniques for querying semi-structured Web pages.
--Stephen Soderland, Children's Hospital, Seattle
Similar resources
Simulating Societies using Distributed AI
This paper discusses the prospects for using Distributed AI techniques to support the computer simulation of societies. Newly developed ideas and techniques are reviewed, some relevant projects are briefly described, and some potential pitfalls are noted.
The Elastic-plastic Mechanics of Crack Extension
This paper briefly reviews progress in the elastic-plastic analysis of crack extension. Analytical results for plane strain and plane stress deformation fields are noted, and elastic-plastic fracture instability as well as transitional behavior and combined rate and thermal effects are discussed.
Mallory's ('alcoholic') hyaline in primary biliary cirrhosis.
Mallory's ('alcoholic') hyaline has been found in hepatocytes in 18 of 70 patients with primary biliary cirrhosis. These inclusions have previously been noted in only three cases of primary biliary cirrhosis. Current views on the nature of Mallory's hyaline are briefly discussed.
How Far Can You Trust A Computer?
The history of attempts to secure computer systems against threats to confidentiality, integrity, and availability of data is briefly surveyed, and the danger of repeating a portion of that history is noted. Areas needing research attention are highlighted, and a new approach to developing certified systems is described.
Equivalence relations and behavior: an introductory tutorial.
With an emphasis on procedural fundamentals, the original behavior-analytic equivalence experiments and the equivalence paradigm are described briefly. A few of the subsequent developments and implications are noted, with special reference to the possible significance of the findings with respect to language and cognition.
Development of compound semiconductor detectors at ESA
Some examples of space-borne applications that require improvements in detector technology compared with conventional Si and Ge designs are described. Properties of compound semiconductors are noted, and a range of different detector developments are briefly reviewed. Material fabrication improvements for several compound semiconductors have resulted in near Fano-limited performance.